Concurrent and fail-safe replicated simulations on heterogeneous networks: An introduction to EcliPSe

Authors

  • Felipe Knop
  • Edward Mascarenhas
  • Vernon Rego
  • Vaidy S. Sunderam
Abstract

This paper presents an overview of the ACES parallel software system and, in particular, an introduction to the EcliPSe layer of the system. The ACES system is a fault-tolerant, layered software system for heterogeneous-network-based cluster computing. The EcliPSe toolkit, which resides on an upper layer, was constructed specifically for replication-based and domain-decomposition-based simulation applications. It is not, however, restricted to simulations and supports any message-passing form of parallel processing. By taking advantage of networks of heterogeneous machines, generally "idle" workstations, EcliPSe programs can achieve supercomputer-level performance with little programming effort. This was a motivating factor in EcliPSe's design. We present an overview of key application-level features in EcliPSe, a new user interface, support for fault-tolerant simulation, and performance results for three simple but large-scale and representative experiments.

Similar resources

On the Effectiveness of Superconcurrent Computations on Heterogeneous Networks

Concurrent computing on networked collections of computer systems is rapidly evolving into a viable technology that is attractive from the economic, performance, and availability perspectives. Several software infrastructures that support such heterogeneous network-based concurrent computing have evolved, and are in use for production-quality high-performance computing. In this paper, we descri...

Third-order Decentralized Safe Consensus Protocol for Inter-connected Heterogeneous Vehicular Platoons

In this paper, the stability analysis and control design of heterogeneous traffic flow are considered. It is assumed that the traffic flow consists of an infinite number of cooperative non-identical vehicular platoons. Two different networks are investigated in the stability analysis of heterogeneous traffic flow: 1) the inter-platoon network, which deals with the communication topology of lead vehicles and...

Fail-safe concurrency in the EcliPSe system

Local or wide-area heterogeneous workstation clusters are relatively cheap and highly effective, though inherently unstable operating environments for long-running distributed computations. We found this to be the case in early experiments with a prototype of the EcliPSe system, a software toolkit for replicative applications on heterogeneous workstation clusters. Hardware or network failures i...

Energy-Aware Probabilistic Epidemic Forwarding Method in Heterogeneous Delay Tolerant Networks

Due to the increasing use of wireless communications, infrastructure-less networks such as Delay Tolerant Networks (DTNs) should be highly considered. DTN is most suitable where there is an intermittent connection between communicating nodes such as wireless mobile ad hoc network nodes. In general, a message sending node in DTN copies the message and transmits it to nodes which it encounters. A...

Available fail-safe systems

Continuity of service and cost-effectiveness are adding new challenges to life critical systems over and above the underlying safety concerns. The introduction of redundant components is a necessary condition for increasing the overall system availability with respect to physical component failures. Here we consider redundancy by means of replicating fail-safe components in a distributed real-t...

Journal:
  • Simul. Pr. Theory

Volume 3  Issue

Pages  -

Publication date 1995